48 research outputs found
Localization of JPEG double compression through multi-domain convolutional neural networks
When an attacker wants to falsify an image, in most cases she/he will
perform a JPEG recompression. Different techniques have been developed based on
diverse theoretical assumptions but very effective solutions have not been
developed yet. Recently, machine-learning-based approaches have started to
appear in the field of image forensics to solve diverse tasks such as
acquisition source identification and forgery detection. In the latter case, the
goal is a trained neural network that, given an image to be checked, can
reliably localize the forged areas. With this in mind, our paper
proposes a step forward in this direction by analyzing how a single or double
JPEG compression can be revealed and localized using convolutional neural
networks (CNNs). Different kinds of input to the CNN have been taken into
consideration, and various experiments have been carried out, also trying to
highlight potential issues to be further investigated.
Comment: Accepted to CVPRW 2017, Workshop on Media Forensic
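As a rough illustration of the kind of trace a double JPEG compression can leave, the sketch below quantizes synthetic integer coefficients once (step `q2`) and twice (step `q1`, then `q2`) and lists the histogram bins that end up empty. It is not the paper's CNN pipeline: the step sizes 5 and 3, the coefficient range, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def quantize(value, step):
    """Return the quantization index of `value` for step `step` (round half up)."""
    return math.floor(value / step + 0.5)


def single_compression_hist(coeffs, q2):
    """Histogram of indices after one quantization with step q2."""
    return Counter(quantize(c, q2) for c in coeffs)


def double_compression_hist(coeffs, q1, q2):
    """Histogram of indices after quantizing with q1, dequantizing, then quantizing with q2."""
    return Counter(quantize(quantize(c, q1) * q1, q2) for c in coeffs)


def empty_bins(hist):
    """List the bins between the smallest and largest index that hold no coefficient."""
    lo, hi = min(hist), max(hist)
    return [b for b in range(lo, hi + 1) if hist[b] == 0]


# Synthetic, uniformly spread coefficients (an assumption, not real DCT data).
coeffs = range(-60, 61)
single = single_compression_hist(coeffs, q2=3)
double = double_compression_hist(coeffs, q1=5, q2=3)

print("empty bins after single compression:", empty_bins(single))
print("empty bins after double compression:", empty_bins(double))
```

With these toy settings the singly quantized histogram has no gaps, while the doubly quantized one shows periodically empty bins. A periodic pattern of this kind is one example of a signal that a detector could learn to pick up.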
Tracing images back to their social network of origin: A CNN-based approach
Recovering information about the history of digital content, such as an image or a video, can be strategic in directing an investigation from its early stages. Storage devices, smartphones and PCs belonging to a suspect are usually confiscated as soon as a warrant is issued. Any multimedia content found is analyzed in depth in order to trace back its provenance and, if possible, its original source. This is particularly important when dealing with social networks, where most user-generated photos and videos are uploaded and shared daily. Being able to discern whether images were downloaded from a social network or captured directly by a digital camera can be crucial in guiding subsequent investigations. In this paper, we propose a novel method based on convolutional neural networks (CNNs) to determine image provenance: whether an image originates from a social network, a messaging application, or directly from a camera. By considering only the visual content, the method works irrespective of any manipulation of metadata performed by an attacker. We have tested the proposed technique on three publicly available datasets of images downloaded from seven popular social networks, obtaining state-of-the-art results.
Counter-forensics of SIFT-based copy-move detection by means of keypoint classification
Copy-move forgeries are very common image manipulations that are often carried out with malicious intent. Among the techniques devised by the 'Image Forensic' community, those relying on scale invariant feature transform (SIFT) features are the most effective ones. In this paper, we approach the copy-move scenario from the perspective of an attacker whose goal is to remove such features. The attacks conceived so far against SIFT-based forensic techniques implicitly assume that all SIFT keypoints have similar properties. On the contrary, we base our attacking strategy on the observation that keypoints can be classified into different types. One can then devise attacks tailored to each specific SIFT class, thus improving performance in terms of removal rate and visual quality. To validate our ideas, we propose a SIFT classification scheme based on the gray scale histogram of the neighborhood of SIFT keypoints. Once the classification is performed, we attack the different classes by means of class-specific methods. Our experiments lead to three interesting results: (1) there is a significant advantage in using SIFT classification, (2) the classification-based attack is robust against different SIFT implementations, and (3) we are able to impair a state-of-the-art SIFT-based copy-move detector in realistic cases.
DepthFake: a depth-based strategy for detecting Deepfake videos
Fake content has grown at an incredible rate over the past few years. The
spread of social media and online platforms makes its dissemination on a
large scale increasingly accessible to malicious actors. In parallel, due to
the growing diffusion of fake image generation methods, many Deep
Learning-based detection techniques have been proposed. Most of those methods
rely on extracting salient features from RGB images to detect through a binary
classifier if the image is fake or real. In this paper, we propose DepthFake,
a study on how to improve classical RGB-based approaches with depth-maps. The
depth information is extracted from RGB images with recent monocular depth
estimation techniques. Here, we demonstrate the effective contribution of
depth-maps to the deepfake detection task on robust pre-trained architectures.
The proposed RGBD approach is in fact able to achieve an average improvement of
3.20% and up to 11.7% for some deepfake attacks with respect to standard RGB
architectures over the FaceForensics++ dataset.
Comment: 2022 ICPR Workshop on Artificial Intelligence for Multimedia
Forensics and Disinformation Detectio
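One simple way to feed depth information to a classifier alongside RGB is to append the depth map as a fourth channel. The abstract only states that depth maps are estimated from the RGB images and used in an RGBD approach, so the per-pixel concatenation below, the nested-list image layout, and the name `stack_rgbd` are assumptions made for illustration, not a description of the authors' implementation.

```python
def stack_rgbd(rgb, depth):
    """Append a depth value to every RGB pixel, giving an H x W x 4 image.

    rgb:   H x W nested list of (r, g, b) tuples
    depth: H x W nested list of depth values (for example from a monocular estimator)
    """
    if len(rgb) != len(depth) or any(len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("rgb and depth must have the same height and width")
    return [
        [list(pixel) + [d] for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


# Toy 1 x 2 image with hand-written depth values (hypothetical data).
rgb = [[(255, 0, 0), (0, 255, 0)]]
depth = [[0.1, 0.9]]
rgbd = stack_rgbd(rgb, depth)
print(rgbd)
```

A network pre-trained on three-channel input would then need its first layer adapted to accept four channels. How that adaptation is done is left open here.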
A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking
Vision Transformer (ViT) architectures are becoming increasingly popular and
widely employed to tackle computer vision applications. Their main feature is
the capacity to extract global information through the self-attention
mechanism, outperforming earlier convolutional neural networks. However, ViT
deployment and performance have grown steadily with their size, number of
trainable parameters, and operations. Furthermore, self-attention's
computational and memory cost quadratically increases with the image
resolution. Generally speaking, it is challenging to employ these architectures
in real-world applications due to many hardware and environmental restrictions,
such as processing and computational capabilities. Therefore, this survey
investigates the most efficient methodologies to ensure sub-optimal estimation
performances. In more detail, four categories of efficiency techniques are analyzed:
compact architecture, pruning, knowledge distillation, and quantization
strategies. Moreover, a new metric called Efficient Error Rate has been
introduced in order to normalize and compare models' features that affect
hardware devices at inference time, such as the number of parameters, bits,
FLOPs, and model size. In summary, this paper first mathematically defines
the strategies used to make Vision Transformers efficient, describes and
discusses state-of-the-art methodologies, and analyzes their performances over
different application scenarios. Toward the end of this paper, we also discuss
open challenges and promising research directions.
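The quadratic growth of self-attention cost mentioned above can be made concrete with a small count. The sketch below assumes non-overlapping square patches of side 16 (a common ViT setting, but an assumption here), ignores any class token, and counts the entries of one attention matrix for one head. The function names are hypothetical.

```python
def num_tokens(height, width, patch=16):
    """Number of non-overlapping patch tokens for an image of the given size."""
    return (height // patch) * (width // patch)


def attention_entries(height, width, patch=16):
    """Entries in one N x N attention matrix, where N is the token count."""
    n = num_tokens(height, width, patch)
    return n * n


for side in (224, 448):
    print(side, num_tokens(side, side), attention_entries(side, side))
```

Under these assumptions, doubling the image side from 224 to 448 multiplies the token count by 4 (196 to 784) and the attention-matrix size by 16 (38,416 to 614,656), so the cost is quadratic in the number of tokens and therefore in the pixel count.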
Removal and injection of keypoints for SIFT-based copy-move counter-forensics
Recent studies exposed the weaknesses of scale-invariant feature transform (SIFT)-based analysis by removing keypoints without significantly deteriorating the visual quality of the counterfeited image. As a consequence, an attacker can leverage such weaknesses to impair or directly bypass, with alarming efficacy, some applications that rely on SIFT. In this paper, we further investigate this topic by addressing the dual problem of keypoint removal, i.e., the injection of fake SIFT keypoints into an image whose authentic keypoints have been previously deleted. Our interest stems from the consideration that an image with too few keypoints is per se a clue of counterfeiting, which can be used by the forensic analyst to reveal the removal attack. Therefore, we analyse five injection tools that reduce the perceptibility of keypoint removal and compare them experimentally. The results are encouraging and show that injection is feasible without causing subsequent detection at the SIFT matching level. To demonstrate the practical effectiveness of our procedure, we apply the best performing tool to create a forensically undetectable copy-move forgery, whereby traces of keypoint removal are hidden by means of keypoint injection.
Diffusion Models for Earth Observation Use-cases: from cloud removal to urban change detection
The advancements in the state of the art of generative Artificial
Intelligence (AI) brought by diffusion models can be highly beneficial in novel
contexts involving Earth observation data. After introducing this new family of
generative models, this work proposes and analyses three use cases which
demonstrate the potential of diffusion-based approaches for satellite image
data. Namely, we tackle cloud removal and inpainting, dataset generation for
change-detection tasks, and urban replanning.
Comment: Presented at Big Data from Space 2023 (BiDS
A Feature-Based Forensic Procedure for Splicing Forgeries Detection
Nowadays, determining whether an image that appeared somewhere on the web or in a magazine is authentic has become crucial. Feature-based image forensics methods have so far proven very effective in detecting forgeries in which a portion of an image is cloned elsewhere onto the same image. However, such techniques cannot be applied to splicing attacks, that is, when the image portion comes from another picture that is usually no longer available for feature matching. In this paper, we show a procedure that allows these techniques to be employed against splicing attacks as well, by resorting to repositories of images available on the Internet such as Google Images or TinEye Reverse Image Search. Experimental results on real-case images retrieved from the Internet are presented to demonstrate the capability of the proposed procedure.